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Well -Founded  Orderings  for  Proving 
Termination  of  Systems  of 
Rewrite  Rules 


ABSTRACT 


Well-founded  partial   orderings  can  be  used  to  prove  the  termination 
of  programs,  and  can  also  be  used  for  algebraic  simplification.     A  new 
class  of  well-founded  orderings  is  presented  which  can  be  used  to  prove 
the  termination  of  programs  expressed  as  sets  of  rewrite  rules.     The 
orderings  are  syntactically  defined  in  terms  of  a  lexicographic  ordering 
and  an  ordering  on  multisets  and  require  an  ordering  on  the  function  and 
constant  symbols  to  be  specified.     This  technique  of  proving  termination 
appears  to  be  of  practical    importance,  because  it  is  able  to  handle  re- 
write rules  that  arise  from  typical   recursive  programs.     Several   efficient 
algorithms  are  presented  which  allow  the  termination  of  a  set  of  rewrite 
rules  to  be  verified  in  linear  time,  in  cases  to  which  this  method  applies. 
These  results  can  be  viewed  in  terms  of  an  incomplete  system  of  logic 
in  which  short  termination  proofs  exist.     The  well-founded  orderings  may 
also  be  useful   for  proofs  by  mathematical    induction  in  various  areas  of 
mathematics.     A  general   characterization  of  a  class  of  rewrite  rules  is 
presented,  for  which  termination  can  be  proved  using  these  orderings. 


1.  Introduction 

Many  programs  can  conveniently  be  expressed  as  sets  of  term 
rewriting  rules.  For  example,  a  program  to  compute  the  factorial  function 
can  be  expressed  as  follows: 

fact(s(x))  ■*  s(x)  *  fact(x) 
fact  (0)  ■+  1 
The  append  function  in  LISP  can  be  expressed  this  way: 

append  (cons(x,y)  ,z)  -»■  cons(x,append(y,z)) 

append  (NIL,z)  -*  z 
We  are  interested  in  efficient,  general  methods  for  proving  that  such  sets 
of  rewrite  rules  always  terminate.  Although  this  is  in  general  an  un- 
decidable  problem  [6],  we  have  found  a  method  that  works  well  in  many 
cases  of  practical  interest.  For  related  work,  see  [4],  [5],  [7],  [10],  [11] 


Suppose  that 


sl  ■*■*! 


s  +  t 
n    n 


is  a  set  of  rewrite  rules.  We  have  found  a  class  of  partial  orderings 
on  terms  with  the  following  properties: 

1.  If  t-  <  s.  in  the  ordering  for  1  <  i  <  n,  then  the  set 

ii  J  —      — 

of  rewrite  rules  is  guaranteed  to  always  terminate. 

2.  Given  s^   and  t. ,   there  is  an  efficient  procedure  for 
deciding  if  t.   <  s.    in  the  ordering. 

This  paper  presents  an  efficient,  simple  method  for  proving  termination  ol 
sets  of  rewrite  rules  in  many  cases  of  practical    interest.     Common  cases 
that  our  method  cannot  deal   with  are  also  presented. 
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Partial  orderings  of  the  kind  we  investigate  are  useful  in 
devising  general  methods  for  algebraic  simplification.  Some  of  the  most 
powerful  theorem  provers  currently  in  use  rely  heavily  on  general  sim- 
plification procedures  ([1],  [2],  [3],  [12],  [13]).  Also  Lankford  [9] 
has  described  a  complete  theorem  proving  strategy  for  the  first-order 
predicate  calculus  which  is  based  on  the  concept  of  simplification.  See 
[8]  for  another  application  of  such  orderings. 

Our  approach  is  related  to  that  of  Dershowitz  [4].  He 
presents  many  specialized  methods,  some  of  which  can  provide  termina- 
tion proofs  that  our  approach  cannot  provide.  We  present  a,  single., 
general  method  which  can  handle  cases  not  covered  by  Dershowitz's 
techniques.  Huet  [5]  is  primarily  concerned  with  showing  that  term  re- 
writing systems  are  "confluent",  and  not  with  exhibiting  particular  well- 
founded  orderings.  Iturriaga's  work  [7]  is  hard  to  understand  and  not 
generally  available.  Lipton  and  Snyder  [10]  give  another  method  for  proving 
termination. 

Of  all  the  above  approaches,  only  Dershowitz's  approach  and  the 
approach  presented  here  are  suitable  for  proving  termination  of  rewrite 
rules  of  the  kind  obtained  when  writing  recursive  programs.  Dershowitz's 
nested  multiset  ordering  is  directly  applicable  only  to  one  function  symbol 
at  a  time,  whereas  the  ordering  presented  here  is  directly  applicable  to 
many  function  symbols  at  the  same  time.  Also,  we  give  explicit  programs 
for  computing  the  ordering  and  analyze  their  time  complexity.  In  addition, 
we  give  a  general  characterization  of  a  class  of  rewrite  rules  all  of  which 
can  be  handled  by  our  ordering.  Both  of  these  techniques  seem  to  be  new 
in  this  area,  although  Dershowitz  may  use  the  latter.  Perhaps  Iturriaga's 
ordering  can  be  adapted  to  our  purposes,  but  he  gives  neither  an  explicit 
program  nor  a  general  characterization  for  his  ordering. 
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2.  Simplification  Orderings 

Definition  A  term  is  an  expression  formed  from  function  symbols, 
constant  symbols,  and  variables,  properly  combined,  Thus  f(x,g(c,y))  is 
a  term.  We  consider  constant  symbols  as  function  symbols  having  no 
arguments.  A  term  without  variables  in  it  is  a  ground  term.  We  use  the 
usual  notion  of  substitution.  A  substitution  is  considered  as  a  multiple 
replacement,  simultaneously  replacing  all  variables  of  a  term  by  terms. 
Thus  {x  <-   f(x),  y  -e  g(c)}  is  a  substitution.  We  assume  all  function  and 
constant  symbols  are  taken  from  some  set  F  of  symbols. 

Definition:  We  say  an  ordering  "<"  is  well-founded  if  there 
is  no  infinite  sequence  t-, ,  t~,  t3,  ...  such  that  t-  >  t-+,  for  all 
i  _>  1.  (Such  a  sequence  t, ,  t~,  t3,  ...  is  called  an  infinite  descending 
sequence. ) 

Definition:  A  partial  ordering  "<"  on  terms  is  a  simplification 
ordering  if  it  has  the  following  four  properties: 

1.  It  is  a  total  ordering  on  ground  terms. 

2.  It  is  a  well-founded  ordering.  (That  is,  there  are  no 
infinite  descending  sequences.) 

3.  (Consistency  with  respect  to  substitution)  If  t-,  and  tp 
are  terms  and  t-,  <  t~  in  the  ordering,  then  for  all  substitutions  e,  t,0 
<t?e  in  the  ordering. 

4.  (Consistency  with  respect  to  subterm  replacement)  If  s,  is 
a  subterm  of  t, ,  and  tp  obtained  from  t-,  by  replacing  s-,  by  s?,  and  S-, 

<  Sp  in  the  ordering,  then  t,  <  t~  in  the  ordering. 

Intuitively,  these  are  desirable  properties  for  a  simplification 
ordering  to  have.  Property  2  guarantees  that  the  simplification  process 
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must  terminate.  Property  4  guarantees  that  simplifying  a  subterm  will 
also  simplify  the  whole  term,  so  simplification  can  be  done  "locally." 

Definition:  A  partial  ordering  on  terms  is  a  partial  simplifi- 
cation ordering  if  it  has  properties  2,  3,  and  4  as  above.  Thus  it  need 
not  be  a  total  ordering  on  ground  terms. 

Theorem  2.1.  If  s  is  a  subterm  of  t  then  s  is  never  greater 
than  t  in  any  partial  simplification  ordering. 

Theorem  2.2.  If  s  is  a  ground  term  and  t  is  a  non-ground 
term,  then  s  is  never  greater  than  t  in  any  partial  simplification  ordering, 

A  replacement  is  an  equation  that  can  only  be  used  in  one 
direction.  We  write  s,  -*■  So  to  mean  that  any  instance  of  s-,  can  be 
replaced  by  the  corresponding  instance  of  s~. 

Definition:  A  replacement  s,  ■*  s^   is  a  simplification  with 
respect  to  a  simplification  ordering  if  s?  is  less  than  s,  with  respect 
to  the  ordering. 

Definition:  A  term  t  is  obtained  from  term  s  using  the 
replacement  s.  ->  t.  if  there  is  some  substitution  e  with  the  following 
properties: 

s.e  is  a  subterm  of  s,  and  t  is  obtained  from  s  by 

replacing  one  occurrence  of  s.e  in  s  by  t.8. 

Note  that  if  t-  <  s.  in  some  partial  simplification  ordering 
then  t.e  <  s.e  also  by  consistency  with  respect  to  substitution.  Hence 
t  <  s  by  consistency  with  respect  to  subterm  replacement. 
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We  say  that  a  set  s,  -*■  t, ,  s«  •*  to,  ...,  s  -*■  t     of  rewrite 
rules  fails  to  terminate  on  input  u,  if  there  is  an  infinite  sequence 
u, ,  u?,  u~,  ...  such  that  for  all  i  >_  1 ,  u.+,  is  obtained  from  Li- 
using  some  replacement  in  the  set.  If  no  such  infinite  sequence 
exists,  we  say  the  set  of  rewrite  rules  terminates  on  input  u, . 

Theorem  2.3.  Suppose  s,  -*•  t, ,  . . . ,  s  -*■  t  is  a  set  of 
rewrite  rules.  Suppose  there  is  a  partial  simplification  ordering 
"<"  such  that  t.  <  s-  in  the  ordering  for  1  <  1  <  n.  Then  the  set  of 
rewrite  rules  terminates  on  all  inputs. 

Proof:  Assume  the  rules  fail  to  terminate  on  input  u, . 
Let  u, ,  u?,  u3,  ...  be  an  infinite  sequence  of  terms  such  that  for  all 
1  L   1 »  ui+-i  is  obtained  from  u.  by  using  some  replacement  in  the  set 
of  rewrite  rules.  We  showed  above  that  u.,n  <  u.  in  the  ordering, 
for  all  i.  Hence  u-, ,  u^,  u^,  ...  is  an  infinite  descending  sequence 
in  the  ordering.  But  this  is  impossible,  because  a  partial  simplification 
ordering  is  well-founded.  Thus  the  given  set  of  rewrite  rules  is 
guaranteed  to  terminate  on  all  inputs. 

3.  Multisets 

A  multiset  is  a  set  in  which  an  element  can  occur  more  than  once. 
We  write  multiset  union  as  *  or  usually  u.  The  number  of  occurrences  of 
an  element  in  a  multiset  A  u  B  is  the  sum  of  the  number  of  its  occurrences 
in  A  and  B. 

Definition:  A  sequence  of  function  symbols  (or  sequence)  is  a 
sequence  of  function  and  constant  symbols  from  F. 
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Definition:     If  term  t  is  of  the  form  f (t, ,    ...    ,  t  )  then 
f  is  the  top-level   function  symbol   of  t.     Also,  {t-, ,    ...    ,  t  }  are 
top-level   subterms  of  t. 

Definition:     If  t  is  a  ground  term  of  form  f(t, ,    ...    ,  t  )  then 
a  path  in  t  is  a  sequence  beginning  with  f  and  followed  by  a  path  from 
some  top-level   subterm  of  t. 

Definition:     If  t  is  a  ground  term  then  Paths   (t)   is  the  multi- 
set of  paths  of  t.     Thus 

n 
Paths(f(t1,   ...    ,  t))  =  {f}  w  Paths(t-). 
1  n  i=l  1 

Paths(c)  =  {c}  for  constant  symbols  c.     Thus  Paths   (f (c,g(a,b) ))  =  {fc, 
fga.fgb}. 

Definition:     If  a  is  a  path  then  Subseq(a)   is  the  multiset  of 
subsequences  of  a. 

Thus  Subseq(fa)   =  {f,A}  Subseq(a) 
=   {f}  Subseq(a)  tw  Subseq(a) 
where  A  denotes  the  empty  sequence  and  concatenation  denotes  subsequence 
concatenation.     Hence  Subseq(fg)  =   {f,g,fg,A}. 

Definition:      If  S  is  a  finite  multiset  of  elements  ordered  by 
some  total   order  relation,   let  List  (S)  be  a  list  {s-, ,  s?,    ...    ,  s  }  of 
the  elements  of  S  in  non-increasing  order.     Each  element  must  appear  the 
same  number  of  times  in  List   (S)  as  in  S. 

Given  a  total   ordering  on  some  set  S,  we  define  a  total   ordering 
on  finite  multisets  of  elements  of  S  as  follows: 

Suppose  U  and  V  are  multisets  of  elements  of  S.     Suppose  U  f  V. 
Let  {u,,   u2,    ...    u,}  be  List   (U)  and  let  {v-, ,  v-,    ...    ,  v£}  be  List(V). 
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If  List  (U)  is  a  proper  prefix  of  List(V),  we  say  U  <  V.  If  List  (V)  is  a 
proper  prefix  of  List  (U),  we  say  V  <  U.  Otherwise,  let  j  be  min{i:  u,  f   v.} 
Then  we  say  U  <  V  if  u.  <  v.,  U  >  V  if  u,  >  v.. 

Note  that  if  the  ordering  on  S  is  well-founded,  so  is  the  order- 
ing on  multisets  of  elements  of  S. 

We  assume  that  there  is  a  total  ordering  on  the  function  and  con- 
stant symbols  F  appearing  in  terms  t  that  we  are  dealing  with.  If  g  >  f 
in  this  ordering,  we  intuitively  think  of  g  as  being  more  complicated  than 
f,  and  we  say  g  is  larger  than  f. 

We  order  sequences  of  symbols  lexicographically.  Suppose  F  = 
(f^,  f ^ ,  ...  f^)  and  G  =  (  gQ,  g-j ,  ...  ,  g  )  are  two  such  sequences  of 
symbols.  Suppose  F  |  G.  If  F  is  a  prefix  of  G,  we  say  that  F  <  Gin 
the  lexicographic  ordering.  If  G  is  a  prefix  of  F,  we  say  G  <  F.  Other- 
wise, let  j  be  min{i:  f .  ±  g.}.  We  say  F  <  G  if  f.  <  g. ,  F  >  G  otherwise. 
Although  this  is  not  a  well-founded  ordering  on  sequences,  useful  well- 
founded  orderings  can  be  obtained  from  it. 

4.  A  Subsequence  Ordering  on  Paths 

We  order  multisets  of  sequences  by  extending  the  sequence  ordering 
to  multisets  as  indicated  above.   If  a  and  3  are  paths,  we  say  a  <  $  in  the 
subsequence  ordering  if  Subseq(a)  <  Subseq(p)  in  the  multiset  ordering  for 
subsequences. 

Note  that  the  subsequence  ordering  on  paths  is  a  total  ordering. 
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Theorem  4.1 .  For  all  paths  a,  3  and  for  all  function  symbols  f, 
Subseq(a)  <  Subseq(3)  iff  Subseq(fa)  <  Subseq(fg). 

The  proof  is  quite  simple. 

Definition:  If  a  is  a  path  and  g  is  a  function  symbol,  let 
#  (a)  be  the  number  of  occurrences  of  g  in  a. 

Definition:  If  a  is  a  path,  let  maxf(a)  be  the  maximal 
function  symbol  in  a.  Let  maxf(a,3)  be  the  maximal  function  symbol  in 
a  or  3,  if  a  and  3  are  paths. 

Let  R(a,3)  be  the  following  relation  on  paths  a  and  3: 

Suppose  g  is  maxf(a,3).  Suppose  a  is  ougapg. .  .a  ga  .,  and 
3  is  3-,g32g. .  .3  g3  +1  where  #  (a)=m  and  #  (3)=n.  Then  R(a,3)  is  true  if 
m<n  or  if  (m=n  and  R(a.,3.)  is  true,  where  j=max{i :a.^3i })• 

Theorem  4.2.  Suppose  a  and  3  are  paths.  Then 

a)  a  <  3  in  the  path  ordering  iff  Subseq(a){f}  <  Subseq(3){f) 
in  the  path  ordering  and 

b)  a  <  3  in  the  path  ordering  iff  R(a,3)  is  true. 

Proof:  By  induction  on  maxf(a,3).  If  these  values  are  the 
same,  then  by  induction  on  max(#  (a),#  (3)). 

Note  incidentally  that  it  is  easy  to  show  that  R(a,3)  implies 
R(af,3f)  for  all  function  symbols  f. 

Assume  that  a  is  c^g^g. .  -am9am+-i  and  3  is  3-|g32g-  -  •  ^n96n+l 

where  #  (a)=m  and  #  (3)=n  and  g  is  maxf(a,  3). 
y        y 

We  show  R(a,3)  iff  n  <   3.   If  m  <  n,  then  Subseq(a)  <  Subseq(3) 

easily.   If  m=n,  then  List(a)  begins  with  gm  List(a  J.. )  and 

m+1 

List   (3)   begins  with  gm  List(3  +1). 
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Let  j  be  max{i:  a.  f   g^.  If  j  =  m+1  then  a  <  3  iff 

am+l  <  em+l .  Suppose  m  =  n  and  j  <  m  +  1 .  Then  Subseq(a)  = 
Subseq(a'a.gy)  and  Subseq(3)  =  Subseq(3' 3-gy)  for  some  y   where 
#  (Y)  =  m  -  j.  Suppose  a.  <  3-  in  the  path  ordering.  Now,  Subseq(a)  = 
Subseq(a'a.y)  *  Subseq(a'a .  ){g}Subseq(y)  and  Subseq(3)  =  Subseq(3' 3 -y )  * 
Subseq(3'3 -){g)  Subseq(y).   We  can  assume  inductively  that 
R(a,a.y;3l3-y)  iff  a'a.y  <  3'3-y  since  a'a.y  and  3'3-y  either  have 

J         J  J  J  J  J 

no  g  at  all  or  else  have  fewer  g  than  a  and  3  do.  Now,  a'a.y  is 
a1ga2g...gajaj+1g...amgam+1  and  3'3jY  is  31ge2g...gejSj+1g...BmgBm+r  i 

Also,  am+1  =  3m+1  ,  ...,  a.+1  =  3  -+1 •  Hence  R(a'a.y,3'3,Y)  is  true  iff 
R(a.a.+1,  3-3j+1)  is  true,  i.e.,  iff  R(a.a,+1,  3,a.+1)  is  true. 
By  repeated  application  of  the  note  following  the  theorem,  R(a.a.+1  3-a.+, ) 
is  true.  Hence  R (a'a.y,  3'3-y)  is  true  and  so  by  inductive  application 

J  J 

of  b),  a'a.y  <  3'3-y  in  the  ordering  on  paths. 

J  J 

It  is  not  difficult  to  see  that  Subseq(a'a .  ){g}  Subseq(y)  < 

•J 

Subseq(3'3.:){g}  Subseq(y)  iff  Subseq(a,){g}  Subseq(y)  <  Subseq(3  •  ){g) 

J  J  J 

Subseq(y) . 

This  is  because  these  two  expressions  are  the  same  on  all  subsequences 
beginning  with  j  or  more  g's.  Their  ordering  will  therefore  be  deter- 
mined by  subsequences  beginning  with  exactly  (j-1)  g's.  Of  these, 
all  subsequences  will  be  the  same  in  both  expressions  except  those 
beginning  with  the  first  j-1  g's. 
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Now,  we  can  assume  inductively  by  a)  that 
Subseq(a.Hg)  <  Subseq($ . ){g}.  (Since  we  assumed  a.  <  $..)   We  want  to  show 
that  for  all  sequences  x,  Subseq(a.){gx}  < 
Subseq(3 -){gx}.  Note  that  for  y,z e Subseq(a  .)  w 
Subseq(e, ),  ygx  <  zgx  iff  yg  <  zg. 

This  is  because  g  does  not  occur  in  y  or  z,  hence  neither  of  yg  and 
zg  is  a  prefix  of  the  other. 

We  thus  have  a  1-1,  order-preserving  mapping  between 
Subseq(a.){g}  *  Subseq(3  .  ){g}  and  Subseq(a,){gx}  w  Subseq(3 . ){gx}.  Hence 

J  J  J  J 

Subseq(a.){gx}  <  Subseq(3 .){gx}  iff  Subseq(a  .  ){g}  <  Subseq(3  •  ){g}. 

J  J  J  J 

Since  this  is  true  for  each  x  in  Subseq  (y ) ,  we  obtain  that 
Subseq(a.){g}Subseq(y)  <  SubseqCgJlglSubseqly) .  Hence  Subseq(a' a  .){g} 
Subseq(y)  <  Subseq(e'  3  .){g}Subseq(y ) .  Therefore  Subseq(a)  <  Subseq(B), 
and  we  have  shown  b)  by  induction. 

We  now  show  a).  Assume  j  is  as  above.  If  #  (a)  <  #  (3)  and 
f  <_  g,  then  a)  follows  easily.  Suppose  #  (a)  <  #  (3)  and  f  >  g.  Then 

List(a)  begins  with  f,m*gf,  (l^ggf, and  List(e)  begins  with  f ,n*gf ,(|J)*ggf ,. . . 

Hence  if  m<n  and  f>g,  Subseq(cf){f}  <  Subseq(3){f }.  Suppose  #q(a)  =  #g(3). 
Suppose  f>g.  Then  [_ist(Subseq(a){f })  and  List(Subseq(3){f>)  will  both  begin  with  f 

m*gf , (o)*99',r' •  •  •  >(m-i  ^s"1"  f'gmf-  After  this,  they  will  agree  on  all 
sequences  having  j  or  more  g's  at  the  beginning.  Also,  they  will  agree 
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on  all  sequences  having  j-1  g's  not  all  of  them  chosen  from  the  first 
j-1  g's.  Hence  the  ordering  on  a  and  6  will  be  determined  by  the  ordering 
on  Subseq(al.gy){f}  and  Subseq(3  -gy  Hf)  for  suitable  y.  As  before, 
Subseq(a.gy  ){f}=  Subseq(a,y  ){f }  *  Subseq(a  .){g}  Subseq(y){f}  and 

J  J  \J 

Subseq(3i9y){f}  =     Subseqfa  .y){f }  *     Subseq(3  -){g}  Subseq(y  )(f }.     Using 

J  J  J 

induction  and  an  argument  as  above,  we  obtain  that  Subseq(a){f}  <  Subseq(3)(f} 

Suppose  #a(a)  =  #a(3)  and  f  <  g.     In  this  case,  Subseq(a){f} 
y  y 

and  Subseq(3){f)  agree  on  all  subsequences  beginning  with  j  or  more  g's. 

They  also  agree  on  all  subsequences  beginning  with  j-1  g's  not  all 

chosen  from  among  the  first  j-1  g's.  Hence  the  ordering  on 

Subseq(a){f}  and  Subseq(3){f)  is  determined  by  the  ordering  on  Subseq(a.gy  ){f} 

and  Subseq(js.gy ){f},  as  above.  Using  induction  and  an  argument  as  above, 

we  obtain  that  Subseq(a){f}  <  Subseq(3)(f }.  This  completes  the  proof. 

Corollary  1:  For  all  paths  a, 3  and  all  function  symbols 
f,  Subseq(a)  <  Subseq(3)  iff  Subseq(af)  <  Subseq(3f). 

Proof:  Subseq(af)  =  Subseq(a)  w  Subseq(a){f}  and  Subseq(3f)  = 
Subseq(3)  *  Subseq(3){f }.  Also,  Subseq(a){f}  <  Subseq(3){f>  iff  Subseq(a)  < 
Subseq(3)  by  the  above  theorem. 

Corollary  2:  The  subsequence  ordering  on  paths  is  a  well- 
founded  ordering. 

Proof:  By  the  above  theorem,  Subseq(a)  <  Subseq(3)  iff  R(a,3). 
But  it  is  easy  to  see  that  the  relation  R  is  a  well-founded  partial 
ordering  on  paths. 

This  result  is  slightly  surprising,  since  the  multiset  ordering 
on  arbitrary  multisets  of  sequences  is  not  well-founded.  However,  if 
we  restrict  ourselves  to  multisets  of  sequences  obtained  from  paths, 
this  ordering  is  well-founded. 
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Definition:  If  a  is  a  path,  let  Desc(a)  be  the  multiset  of 
non-increasing  subsequences  of  a. 

Theorem  4.3.  For  all  paths  a  and  3,  Subseq(a)  <  Subseq($) 
iff  Desc(a)  <  Desc  (3). 

Proof:  By  induction  on  naxf(a-$).  Let  a  be  a,ga0g. .  .a  gar,+-) 
and  let  3  be  3-igBo9---B  93  +1 ,  where  g  =  maxf(a.e)  and  #  (a)  =  m  and 
#g(e)  =  n. 

We  show  Subseq(a)  <  Subseq(3)  implies  Desc(a)  <  Desc(3). 
It  is  clear  that  Subseq(a)  =  Subseq(3)  implies  Desc(a)  =  Desc  (3), 
and  Subseq(a)  >  Subseq(B)  implies  Desc(a)  >  Desc(3)  by  symmetry. 

Assume  that  Subseq(a)  <  Subseq(3).  If  m<n,  then  the  maximal 
element  of  Desc(a)  is  of  the  form  g  a'  and  the  maximal  element  of 
Desc(3)  is  of  the  form  gng '  where  #  (a1 )  =  0  and  #  (3' )  =  0.  Since 
g  a'  <  g  3'  in  the  lexicographic  ordering  on  sequences,  Desc(a)  <  Desc(3). 

Suppose  that  m=n.  Let  j  be  max{i  :a-j^3. }.  We  know  then  that 
Subseq(a-)  <  Subseq(3-)-  Also,  Desc(a)  and  Desc(3)  will  agree  on  all 
subsequences  beginning  with  j  or  more  g's,  and  on  all  subsequences 
beginning  with  j-1  g's  not  all  chosen  from  the  first  j-1  g's  of  a  and 
3.  Hence  the  largest  subsequences  on  which  Desc(a)  and  Desc(3)  will 
differ  will  begin  with  gJ~  followed  by  a  subsequence  not  containing 
any  g's.  Hence  Desc(a)  <  Desc(3)  iff  gJ~  Desc(a.a  .+, . . .a  +, )  < 

gJ"lDeSc(Bj6j+T-'Vl)- 

We  know  that  a.+1   =  3j+-|  »•  •  •  »am+i   =  Sm+-|  •     Also>  Subseq(a.)< 

Subseq(3j).     Hence  Subseq(a.a ,+] . . .am+1 )    <  Subseq(0.3j+1 • • -Bm+1 ) • 

We  can  assume  inductively,   therefore,  that  Desc(a.a.+-, . .  .a  +, )  < 

Desc(3j3j+r..3m+1).     Hence  gJ_1Desc(ajaj+1 . .  .am+] )   <  gJ"1Desc(3j3j+]  • .  -3m+1 ) 

and  the  theorem  is  proven. 
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Note  that  the  lexicographic  ordering  on  non-increasing 
sequences  of  function  symbols  is  a  well-founded  ordering.  Hence  the 
induced  multiset  ordering  on  multisets  of  such  sequences  is  also  well- 
founded.  This  gives  us  another  way  to  show  that  the  subsequence  ordering 
on  paths  is  well-founded. 

We  now  derive  some  results  which  are  related  to  an  efficient 
algorithm  for  computing  the  subsequence  ordering  on  paths. 

Theorem  4.4.  Suppose  x  is  the  maximal  subsequence  of  a  path 
a.  Then  x  is  a  non-increasing  sequence  and  the  last  symbol  of  x  is  the 
same  as  the  last  symbol  of  a. 

Theorem  4.5.  Suppose  x  and  y  are  the  maximal  subsequences  of 
a  and  3,  respectively.  Suppose  the  last  symbols  of  a  and  3  are  dif- 
ferent. Then  x  and  y  differ,  and  so  Subseq(a)  <  Subseq(3)  iff  x  <  y. 

Theorem  4.6.  Suppose  a  and  3  are  paths,  and  f  and  g  are 
symbols.  Suppose  that  a  <  3g  in  the  path  ordering,  and  f  <  g  in  the 
symbol  ordering.  Suppose  that  the  last  symbol  of  a  is  not  g.  Then 
af  <  3g  in  the  path  ordering. 

Proof:  Let  f,f0...f  be  the  maximal  subsequence  of  a, 
and  let  g,g,,...g  be  the  maximal  subsequence  of  3g.  Note  that  both 
subsequences  will  be  non-increasing  sequences.  Since  the  last  symbols 
of  a  and  Bg  differ,  the  sequences  f-,fo'"fm   ancl  9i99' ' " 9  (i'1'^er»  so 
f^f2---fm  <  g-j 92 *  •  •  9n  in  the  lexicographic  ordering.  Since  g  is  g 
and  9]92'-'9n  is  a  non-increasing  sequence,  g.  ^g  for  all  i,  l<1<n. 
The  maximal  subsequence  of  af  will  be  f,f«...f.f  for  some  j,  0<j<m.  We 
know  that  f..  <  f  for  all  i,  j<i<m.    If  g-,g2...g.  is  not  identical 
to  f^.-.fj  then  gig2...gn  >  Vz.-fjf   since  g^.-.g,,  >  V2-»V 
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Suppose  g,  g2- . .  g  -  is  identical  to  f-|f2...f..  Then  j  <  n,  because  j  =  n 
implies  a  >  eg  in  the  path  ordering.  Since  f  <  g,  we  have  f  <  g.+,  and 
so  f-|fp...f-f  <  g-igo- •  •9-j9-j+i  •  • -9  n-  Thus  af  <  3g  in  the  path  ordering. 

Theorem  4.7.  Suppose  that  a  and  3  are  paths.  Let  a  =  a'y 
and  3  =  3'y  where  y  is  the  longest  common  suffix  of  a  and  3.  Let 
x  and  y  be  the  maximal  elements  of  Desc(a')  and  Desc($'),  respectively, 
in  the  lexicographic  ordering  on  subsequences.  Then  Subseq(a)  <  Subseq(3) 
iff  x  <  y  in  the  lexicographic  ordering  on  subsequences.  (In  fact,  x 
and  y  are  the  maximal  elements  of  Subseq(a')  and  Subseq(s'),  respectively.) 

5.  A  Path  Ordering  on  Terms 

Given  ground  terms  t-,  and  to,  we  order  them  by  the  multiset  order- 
ing on  their  paths.  That  is,  we  order  paths  a  and  3  by  the  subsequence 
ordering,  i.e.,  a  <  3  iff  Subseq(a)  <  Subseq{3).  We  then  order  t-,  and  t2 
by  the  multiset  ordering  on  Paths(t,)  and  Paths(to).  Thus,  if  List(Paths 
(t-j))  s  {<»]•  a2i  •••  >  am>  and  List(Paths(t2) )  =  {3i»  B2»  •••  »  3n>,  we  say 
t-,  <  t2  if  List  (Paths(t1))  is  a  proper  prefix  of  List(Paths(t2))  or  if 
Subseq(a-)  <  Subseq(B-)  where  j  =  min{i:  a.  f   g. }.  Hence  we  are  really 
using  two  levels  of  multisets  to  order  t,  and  t?:  multisets  of  paths, 
each  path  considered  as  a  multiset  of  sequences. 

Note  that  this  gives  a  well-founded  ordering  on  ground  terms, 
since  we  are  ordering  multisets  of  paths  and  the  ordering  on  paths  is  a 
well-founded  ordering. 

Definition:  The  path  ordering  on  terms  ^s  ::o fined  as  follows: 

a)  If  t,  and  t2  are  ground  terms,  t,  <  t2  if  Paths(t-,)  < 
Paths(t2)  in  the  above  ordering. 
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b)     If  t1   and  to  are  non-ground  terms,   t]   <  to  if  for  all   e  such 
that  t,e  and  t2e  are  ground  terms,   Paths(t]o)  <  Paths(t2e)   in  the  above 
ordering. 

Theorem  5.     The  path  ordering  on  terms  is  a  partial   simplification 
ordering.     Note  that  we  may  have  Paths(t,)=  Paths(t2)  even  if  t,   and  to 
are  not  identical.     For  example,   f(a,b)  and  f(b,a)  have  the  same  multiset 
of  paths.      If  t,   and  t2  are  ground  terms  and  have  unequal   multisets  of 
paths,   then  either  t,    <  t2  or  t2  <  t,    in  the  path  ordering  on  tei 
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6.  Computing  the  Orderings 

Given  paths  a  and  3,  we  present  an  algorithm  to  compute  whether 
Subseq(a)  <  Subseq(3).  Let  |a|  be  the  length  of  a,  i.e.,  the  number  of 
occurrences  of  symbols  in  a.  Let  a[i]  be  the  i   symbol  in  a,  and  assume 
that  the  first  symbol  is  a[l]. 
procedure  order(a,3); 
m  •*-  |a | ; 
n  «-  |e| ; 

while  a[m]  =  3[n]  and  m  >  0  and  n  >0  do  (m  «-  m  -  1;  n  -<-  n  -  1); 
j_f  m  =  0  and  n  =  0  then  return  ("a  =  3"); 

i  <-   1 ;  j  +■   1 

wh i  1  e  i  <_  m  and  j  <  n  do 

(if  a[i]  >  e[j]  then  j  <-  j  +  1  else 

if  a[i]  <  3[j]  then  i  •*■   i  +  1  else 

(1  «-  1  +  1;  j  «-  j  +  1);); 

j_f  i    '  rn  then   return   ("3  >  a")   ejje_  return   ("a  >   3"); 
end  ordf'r; 
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This  method  requires  at  most  |a|  +  1 3 1  -  1  comparisons  to  compute 
the  ordering  on  a  and  3.  It  is  easy  to  show  that  any  comparison-ba^ed 
method  of  computing  this  ordering  requires  at  least  |a|  comparisons  on  in- 
puts a,  3  such  that  a  <  3  in  the  subsequence  ordering  on  paths. 

Wo  now  show  that  the  procedure  "order"  is  correct.  Let  y   be 
the  longest  common  suffix  of  a  and  3.  Suppose  a=a'y  and  3=3'y  .  At 
the  end  of  the  first  while  statement,  m=|a'|  and  n=|e'|.  If  a=3  then 
m=n=o  and  the  procedure  will  return  "a=3". 

We  show  inductively  that  the  second  while  statement,  started 
on  strings  a1  and  3'  which  do  not  have  the  same  last  character,  will 
work  correctly.  If  a'=A  and  3'^A  or  if  a'^A  and  3'=A,  it  is  clear  that 
the  program  works  correctly.  Suppose  a'^A  and  3 ' j^A   Let  a'  be  a-.g-.a2 
and  let  3'  be  3i 9pe2 '  wnere  9i  1S  maxf(a' )  and  #  (a-.)=o.  Also,  g2  is 
maxf(3')  and  #q  (3,)=o.  Thus  we  have  specified  the  first  occurrences  of 
g,  and  g2  in  a1  and  3',  respectively. 

Suppose  g-,  >  g2.  Then  i  will  stop  scanning  at  |a-,|  +1  but 
j  will  continue  all  the  way  through  3'  and  so  eventually  j  >  n  and 
the  algorithm  will  return  "a>3",  which  is  correct.  Similarly,  if 
g2  >  g-i  the  algorithm  will  return  "3>a",  which  is  correct. 

Suppose  g-|=g2-  Tnen  a>3  iff  a2  >  B2»  by  previous  results 
and  the  fact  that  a2  t   32-  We  eventually  have  i  =  |a-,  |  +  1  and 
j  =  |b,|  +  1.   This  causes  the  statement  ("M+l;  W+l)  to  execute. 
From  then  on,  the  while  statement  behaves  as  if  it  were  started  on  a2 
and  Bp.  We  can  assume  inductively,  therefore,  that  the  program  will 
output  "a  >  3"  if  a2  >  32,  and  "a  <  3"  if  a2  <  32-   But  a2  >  32  iff 
a  >  3  and  a2  <  32  iff  a  <  3,  so  the  program  is  correct.  This  completes 
the  proof. 
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The  following  algorithm  orders  two  paths  a  and  B  in  the 
subsequence  ordering  on  paths.  It  scans  a  and  B  from  back  to  front 
rather  than  from  front  to  back.  Notation  is  as  in  the  previous  algorithm. 
procedure  orderb(a,B); 

i«-|a| ; 

while  a[i]   =   B[j]  and_  i>0  and  j>0  do    ("M-l;  J+-J-1); 
if  i=0  and  j=0  th_en  return("a=B") ; 
if  i=0  and  j>0  then   return ( "a<B" ) ; 
if  i>0  anjd  j=0  then  return ( "a>B" ) ; 
LI:     if  i>0  then 

(1+1-1; 

if  a  [i]<3[j]  then  goto  LI  else  goto  L2) 
else  return ("a<3"); 
L2:  if  j>o  then 

(W-lJ 

if  a[i]>B[j]  then  goto  L2  else  goto  LI) 
else  return ("a>B"); 
end  orderb; 

The  reason  this  algorithm  works  is  the  following: 
Let  N(y-|»Yo)  be  the  following  relation  on  paths  y,  and  y «: 
Y,  <  Y„  in  the  subsequence  ordering,  but  Y,  >  (y«  with  the 
first  symbol  removed)  in  the  subsequence  ordering. 

Then  for  all  paths  Y,,  Y2  we  have  that  N(y,,  grJand  f  <  g 
implies  N(fYl,  gy2)-  N(Y-,,  9y2)  and  f  >  g  implies  N(gy2,fYl).   In  the 
above  algorithm,  let  y  be  the  longest  common  suffix  of  a  and  p.  Suppose 
asa'Y  and  B=B'y-  Let  a,   be  a[i  ]a[i+l], .  .a[m]  and  let  3,  be 
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3[jMj+l]...e[n],  where  m=|a'|  and  n=|e'|.  Then  at  LI,  N^,  3-|)  is 
always  true  and  at  L2,  N(3r  a-,)  is  always  true. 

Given  ground  terms  t,  and  t2,  we  compute  the  path  ordering  on  t, 
and  to  as  follows: 

Let  SI  be  Paths^)  and  let  S2  be  Paths(t2).  If  SI  =  S2  then  t, 
=  t~  in  the  ordering.  If  SI  f  S2  and  SI  c  S2  then  t,  <  t2  in  the  ordering. 
If  SI  =f  S2  and  S2  c  SI  then  to  <  t,  in  the  ordering.  Otherwise,  sort  SI  and 
S2  in  non-increasing  order  using  the  above  procedure  to  compute  the  subse- 
quence ordering  on  paths.  We  thus  obtain  Li st (SI  )  =  {a-,,  a?,  ...  ,  a  } 

and  Lis t (S2 )  =  (3i,  32,  ■•■  '  en^'  Let  J  be  mi'n^'1'  :  ai  t  &j}-  Then  ti 
<  t?  in  the  path  ordering  on  terms  if  a,  <  3.  in  the  subsequence  ordering 
on  paths.  Otherwise  t-,  >  to. 

Let  Size(t)  be  the  number  of  occurrences  of  function  and  constant 
symbols  in  a  term  t.  Suppose  Size(t,)  =  q,  and  Size(to)  =  qo-  Then  SI 
has  at  most  q,  elements.  Also,  each  element  of  SI  is  of  length  at  most 
q,.  Hence  each  comparison  done  while  sorting  SI  requires  at  most  2q-,  -  1 

comparisons  of  symbols,  and  we  can  sort  SI  in  0((q,  log  q,)(2q-,  -  1))  or 

2  2 

0 (q-.  log  q-. )  comparisons.  Similarly  we  can  sort  S2  in  0(q2  log  q2)  compari- 

2   2 
sons.  Finding  j  requires  at  most  min(q,,  q2)  comparisons  and  comparing  a. 

and  3.  requires  at  most  q-,  +  q2  -  1  comparisons.  So  the  entire  algorithm 

2         2 
is  of  complexity  0(q,  log  q-,  +  q2  log  q2)  comparisons. 

It  is  oossible  to  do  bettor  thon  this,  l-.'c  present  a  method 
which  can  sort  n  paths  in  time  0(L  min(logk,logn))  where  L  is  the  sum  of 
the  lengths  of  the  paths  and  k  is  the  number  of  distinct  function  symbols. 
We  present  a  related  method  for  computing  the  path  ordering  on  ground 
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terms  t,  and  t«.  The  time  required  is  0(Lmin(logk,logL) )  where  L  is  size  (t^) 
+  size  (t?)  and  K  is  the  number  of  distinct  function  symbols.  Hence 
this  is  a  linear  time  method  if  the  set  of  function  symbols  is  fixed. 

Given  in  paths,  the  following  algorithm  will  output  them  in 
order,  smallest  first,  according  to  the  subsequence  ordering  on  paths. 
The  time  required  is  0(|_  min(logk,  logn))  where  L  is  the  sum  of  the 
lengths  of  the  paths  and  k  is  the  number  of  distinct  function  symbols. 

The  algorithm  considers  v(j)  to  be  the  j   path,  for 
1  <_  j  <  n,  and  v(j)  is  considered  to  be  a  stack.  Thus  f*=v(j)  removes  the 
first  function  symbol  from  v(j)  and  assigns  it  to  f.  The  function 
symbols  are  identified  with  the  integers  {l,2,...,k}.  The  function 
symbols  are  ordered  by  the  usual  ordering  on  the  integers.  Thus 
1<2,  2<3  et  cetera.  The  algorithm  makes  use  of  k  queues  Q, ,  Q?,  ...,  Q. 
together  with  another  queue  T.  The  empty  queue  (or  stack)  is  denoted  by  A. 
Stack  and  queue  operations  are  indicated  in  the  usual  way  by  <=  .  Thus 
j<=T  means  remove  an  element  from  the  queue  and  assign  it  to  j.  Statements 
such  as  T«-A  and  T-eQ.  are  not  queue  operations  but  ordinary  assignment 
statements.  Thus  T«-a  sets  T  to  be  the  empty  queue,  and  T+-Q.  sets  T 
to  be  the  current  configuration  of  the  queue  Q..  The  algorithm  outputs 
a  list  a-j,  a,,,  ...,  a  of  integers  such  that  the  initial  values  of 
v(a,),  v(a~),  ...,  v(a  )  constitute  the  list  of  paths  in  nondecreasing 
order. 

At  any  time  in  the  execution  of  the  algorithm,  the  list  Q,Q2...Qk 
of  paths  is  in  increasing  order,  where  each  queue  is  listed  from  front 
to  back  and  where  we  only  consider  the  subsequence  ordering  on  the  portion 
of  the  path  already  seen. 
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It  would  be  interesting  to  know  if  there  is  a  similar  algorithm 
which  outputs  the  paths  largest  first, 
procedure  sort(v,n); 
flag-^true; 
TVA; 

for  j*-l  to  n  do  T^j ; 
for  j-*-1  to_  k  do  Q.-<-A; 
while  flag  dp_ 
(while  TYA  do 


(J-T; 

if  v(j)=A  then  output(j) 

else  (f-v(j);  Qf<=j);); 

lf(3J)(QjM)  then 

(i-Hnin{j:Q.^A}'»  [Use  a  priority  queue  for  this] 

T^;  Q^A) 

else  flag^-false; ) ; 

end  sort; 

To  show  that  procedure  "sort"  works,  it  suffices  to  show  that 
on  any  pair  v(i),  v(j)  of  input  strings,  it  behaves  the  same  as  procedure 
"order".  For  strings  with  differing  last  characters,  this  is  easy  to  do. 
For  strings  with  common  nontrivial  suffixes,  this  is  more  difficult  but 
still  can  be  done  by  considering  the  relative  position  of  v(i)  and  v(j) 
within  a  queue,  if  they  are  in  the  same  queue. 
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The  following  algorithm  also  sorts  n  paths  v(l),  v(2),  ...,  v(n) 
and  outputs  the  integers  {1,  2,  ...,  n}  in  an  order  corresponding  to 
a  listing  of  the  paths  in  non-decreasing  order.  Notation  is  as  in  procedure  sortj 
The  difference  is  that  the  paths  are  scanned  from  back  to  front  rather 
than  from  front  to  back.  That  is,  f*=v(j)  means  "Let  f  be  the  last  symbol 
in  path  v(j)  and  delete  this  symbol  from  the  end  of  v(j)."  However, 
the  queue  operations  j  *■  Q.  et  cetera  are  as  before.  This  algorithm 
is  more  useful  than  the  preceding  one  in  certain  situations. 

procedure  sortb(v,n); 
for  j=l  to_  k  do_  Q.-Hl; 
for  j=l  to  n  do 

if  v(j)=A  then  output(j)  else 

(f-v(j);  Qf*-J); 

while  (3j)(QjM)  do 

(iVmin{j:Q./A};  [Use  a  priority  queue  for  this] 

if  v(j)=A  then  output(j)  else 
(f-v(j);  Qf-j);); 
end  sortb; 

We  can  verify  that  "sortb"  works  by  showing  that  it  treats 
input  strings  v(i),  v(j)  the  same  way  that  "orderb"  does.  Common 
nontrivial  suffixes  do  not  present  a  problem  because  the  strings  are 
scanned  from  back  to  front. 
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We  present  an  algorithm  which  will,  given  a  set  {t, ,...,t  }  of 

n 
ground  terms,  sort  the  multiset  .*,  Paths  (t- )  in  time  0(Lmin(logk,  logL)) 

n 
where  L  =  .£,  Size(t-)  and  k  is  the  number  of  distinct  function  symbols  in 

the  terms  t . .  For  purposes  of  this  algorithm,  we  associate  subterms  of  a 

term  t  with  prefixes  of  paths  in  t.  If  the  same  subterm  occurs  more  than 

once  in  t,  we  associate  a  prefix  of  a  path  with  each  occurrence  of 

the  subterm  in  t. 

Suppose  t  is  of  the  form  f(v, ,...,v  ).  With  t  itself  we  associate 
the  path  consisting  of  the  single  function  symbol  f.  Suppose  w  is  a 
subterm  of  v. for  sum  i,  l<i<m.  Suppose  that  a  is  the  path  associated  with 
was  a  subterm  of  v..  Then  fa  is  the  path  associated  with  w  as  a  sub- 
term  of  t. 

For  example,  if  t  is  f(g(a,b),c)  then  the  path  associated  with 
b  is  fgb,  the  path  associated  with  g(a,b)  is  fg,  and  the  path  associated 

with  t  itself  is  f. 

n 
Sorting  .*,  Paths (t. )  is  therefore  equivalent  to  sorting  the  occurrences  of 

constant  symbols  in  the  terms  t. ,  1<i<n.  This  is  because  there  is  a 

1-1  correspondence  between  elements  of  Paths(t)  and  constants  in  t, 

using  the  above  association  of  paths  and  occurrences  of  subterms. 

Recall  that  the  top-level  subterms  of  a  term  f(v,,...,v  ) 

form  a  multiset  {v, ,v2, . . . ,v  }.  Assume  the  function  symbols  are  the 

integers  {l,2,...,k},  and  let  notation  be  as  in  procedure  sort. 

The  following  algorithm  will  output  in  non-decreasing  order 
n 
the  elements  of  .*-,  Paths(t. ),  each  path  represented  by  an  occurrence 

of  a  constant  symbol  in  some  term  t . .  We  assume  that  enough  information 

is  kept  with  each  occurrence  of  a  constant  symbol  to  identify  which  path 

it  corresponds  to  when  it  is  printed  out.  The  last  "for"  statement  must 

deal  with  each  occurrence  of  a  top-level  subterm  v  of  t  separately. 
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procedure  sortt({t, ,t2 t  }); 

T«-A; 

for  j^l  ta  k  do  Q--«-A; 

for  j-*-l  to  n  do 

(f^top-level  function  symbol  of  t.; 

Qf-tj)i 

while  (3j)(Qj/A)  do 

(i-rniin  {j:Q-/A);   [Use  a  priority  queue  for  this] 
TVQ.;  Q.+-A; 
while  T^A  do 
(t-T; 

vf  t  is  a  term  consisting  of  a  single  constant  symbol  then 
output(t) 
else 

for  all  top-level  subterms  v  of  t  do 
(f«-top-level  function  symbol  of  v; 
Qf-v););); 


end  sortt; 

This  algorithm  can  be  used  to  compute  the  path  ordering  on  two 
ground  terms  in  linear  time,  assuming  that  the  number  of  function  symbols 
is  bounded.  We  do  not  know  whether  this  technique  can  be  extended  to 
sort  a  set  of  arbitrarily  many  ground  terms  in  linear  time,  assuming 
that  the  number  of  function  symbols  is  bounded. 

The  output  of  procedure  sortt  can  be  simplified,  for  most 

n 
purposes.  Suppose  a. ,  a-,  ...,  a  is  the  list  of  paths  of  .«(.  Paths(t-) 
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in  non-decreasing  order.  Suppose  path  a.  comes  from  term  a.,  a.  e 

J  J     J 

{1,  2,  ...,  n}  for  all  j,l<j<p.  Then  all  we  usually  need  is  the  list 
<a-,,  b,Xa2i  b?>  .  .  .<  a  ,  b  >  where  b,  is  arbitrary  and  b.  =  1  if  a. 
and  a.  -,  are  identical,  b-  =  o  otherwise.  If  path  a.  occurs  in  more 
than  one  term,  we  are  assuming  that  it  is  counted  the  right  number  of 
times  as  coming  from  each  term.  By  scanning  this  list  in  reverse,  we 
can  easily  find  the  largest  term  in  {t-,,  t2,  ...,  t  }  in  linear  time. 
It  is  not  difficult  to  modify  the  procedure  sortt  so  that  it  computes 
the  bits  b.  correctly. 

Define  Paths(t)  for  a  non-ground  term  t  as  follows: 
Paths (x)  =  {x}  for  variables  x 
Paths (c)  =  (c)  for  constant  symbols  c 
Paths(f(tr  ...  ,  tn))  =  {f}  w  Paths(ti) 

Thus  Paths(f(g(x, ,c) ,  x2))  = 
{fgx, ,  fgc  ,fXp}. 

Define  Size(t)  for  a  non-ground  term  t  to  be  the  number  of 
occurrences  of  function  and  constant  symbols  and  variables  in  it. 

Given  not  necessarily  ground  terms  t,  and  t?,  we  compute  the  path 
ordering  on  t-.  and  to  as  follows: 

Let  Paths  (t^  be  s]  u  S^  u  . . .  u  sj  u  R  and  let  Paths (t2)  be 
S^  u  s2  u  ...  u  S^  u  R2,  where  S1  and  R.  are  all  multisets.  Also,  {x] ,  ...  , 
x.  }  is  the  set  of  variables  appearing  in  t-,  or  t2,  and  S.  =  {a:  ax.  e  Paths 
(t.)}.  In  addition,  R.  is  the  set  of  paths  of  t.  ending  with  a  constant  symbol 

Thus,  if  t]  =  f(g(xrc),x2)  then  s]  =  {fg},  S2  =  {f},  R]  =  {fgc}. 
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Theorem  6.1 .  t-,  <  t2  in  the  path  ordering  on  terms  iff  both  of 
the  following  are  true: 


,1 

of  paths. 


1    p 
a)  For  all  j,  1  <_  j  <_  k,  S !  <_  s .  in  the  ordering  on  multisets 


b)  t-|9  <  t?6  in  the  ordering  on  terms,  where  e  is  the  substitu- 
tion replacing  all  variables  by  the  minimal  constant  symbol  of  F. 

The  direct  method  of  applying  this  theorem  yields  an  algorithm  whose  worst 

2         2 
case  complexity  is  0(qj  log  q-j  +  q2  log  q„)  where  q.  =  Size(t-). 

Using  techniques  similar  to  those  mentioned  above,  we  can  compute  the 
ordering  on  non-ground  terms  in  linear  time,  assuming  that  the  set  of 
function  symbols  is  fixed. 

rt!e  now  develop  some  results  leading  up  to  a  nroof  of  the  above 
theorem,  and  present  an  efficient  algorithm  for  computing  the  path  ordering 
on  non-ground  terms. 

Theorem  6.2.  Suppose  S,  *  T,  >  S?  w  T?  in  a  multiset  ordering, 
where  S, ,  T, ,  S?,  and  T?  are  multisets  of  elements  from  some  totally 
ordered  set.  Suppose  T,  >_  T?  in  the  multiset  ordering.  Suppose  a  is  a 
mapping  on  elements  of  T,  and  T?  which  is  1-1,  order  preserving,  and 
monotone  increasing  on  T,  w  T?.  Then  S,  w  (T,a)  >  S?  *  (T^a)  in  the 
multiset  ordering. 

Proof:  It  suffices  to  consider  the  case  in  which  S,  n  S^  = 
0  and  T-j  n  T„  =  0  since  elements  which  occur  the  same  number  of  times 
in  S^  and  S^  or  T,  and  Tp  don't  affect  the  argument.  Assume  then  that 
S1  n  S«  =  0  and  T,  n  T  =  0. 

Let  s,,  Sp,  t-,  ,  and  t~  be  maximal  elements  in  S, ,  S~,  T, , 
and  T„,  respectively.  Then  t-,  >  tp  since  T,  >  T?.  Also,  t-,a  >  t^a 
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since  a  is  1-1  and  order  preserving.  In  addition,  t,a  >  t,  and  t?a  >  t? 
since  a  is  monotone  increasing.  Furthermore,  t,a  and  t«a  are  the 
maximal  elements  of  T,a  and  T2a,  respectively. 

Suppose  that  s,  >_  t, .  Then  s,  >  s2  since  max(s,,  t, )  >_  max(s?,  t?) 
and  S,  n  Sp  =  0.  Also,  t-,a  >  t2a  as  noted  above.  Hence  max(s,,  t,a)  > 
max(s2,  tpa)  so  S,  *  T,a  >  S2  w  T^a. 

Suppose  that  s,  <  t, .  Then  t,  >  s2  since  max(s, ,  t-. )  >_ 

max(s2,  t2).  Hence  t-.a  >  s~   since  a  is  monotone  increasing.  Also, 

t-,a  >  t~a  since  a  is  1-1  and  order  preserving.  Hence  t-,a  >  max(s?,  t?a) 

so  S-j  *  T^  >  S2  W  T2a. 

Corollary:     Suppose  S,  itfV,  m  ...  i«V     >S,*W,  w  ...  mW 
J  vv  1  1  n         2         1  n 

in  a  multiset  ordering  as  above.     Suppose  V.  ^W.  for  j  =  1,  2,    ...,  n. 

J       J 

Suppose  the  mappings  a.  are  1-1,  order  preserving,  monotone  increasing 
mappings  on  V.  *  W,  for  j  =  1 ,  2,  . . . ,  n.  Then  S]  *  v^,  *  V2a2  *  ...  *  Vnan 

2    11    2  2        n  n 

Proof:  By  repeated  application  of  the  above  theorem. 

Proof  of  Theorem  6.1.:  Assume  t-,  <  t2.  Then  by  definition  of 
the  path  ordering  on  non-ground  terms,  t-,6  <  t?e  where  e  is  as  in  the 

theorem.  So  b)  is  true. 

1    2 
Suppose  that  for  some  j,  S.  >  S.  in  the  ordering  on  multisets 

of  paths.  Choose  substitution  a  to  replace  x.  by  a  very   large  term  and 

x-  by  wery   small  terms,  for  i  f   j.  Then t-, a  >  t?a.  Hence  a)  is  true 

if  t-,  <  t2. 

Assume  a)  and  b)  are  true.  We  show  that  t-,  <  t2  in  the  path 

ordering  on  terms.     Let  e'    be  a  substitution  replacing  all   variables 
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x-,,  ...,  x,  by  ground  terms  v-j ,  ...,  vk,  respectively.  We  show  that 
t,e'  <  t-e'  in  the  path  ordering  on  terms. 

Suppose  c  is  the  minimal  constant  symbol  in  F.  We  know  by 
b)  that  Paths(t1)lc)   Paths(t2){c} ,  where  Paths(t] ){c)  denotes  {xc:xe  Paths(t]) 

and  similarly  for  Paths(t2){c}.  By  a),  S.{c}  <  S2{c}  for  1  <  j  <  k. 

1         2 
Hence  n,*S.{c}  5  n.*S.{c}  for  1  <  j  <  k,  where  n,  is  the  number  of  paths 

J    J  J    J  J 

in  Paths(v-).  (Here  n,*S.{c}  means  n.  copies  of  S.{c}  unioned  together, 

J  J    J  J  J 

and  S.{c}  means  {xc  :  XeS.}.) 

J  J 

k     -i  ^     2 

Thus  .*,  n.*S.{c}  w  R  <   .w,n.*S.{c}  *  RQ  by  the  above  remark 
J=l  J  J      1    J=l  J  J      2 

and  b).  Thus  .^  si  Paths  (v.)  *  R]  <  *  S2.   Paths  (v.)  *  R2  by  the  above 

corollary.   (The  mappings  we  are  using  are  those  that  replace  a  path  of 
form  3c  by  a  path  of  form  3y  for  some  y  e  Paths (v^),  y  f   {c}.  Such 

a  mapping  is  1-1,  order-preserving,  and  monotone  increasing,  as  required.) 

k  1 
Note  that  Paths(t,e ')  =  .*,S'.  Paths(v-)  w  R  and  Paths(t?e  )  = 

k 

.*,S.  Paths(v.)  w  R0.  Hence  tne1  <  t«e '  in  the  path  ordering  on  terms. 
J=l  J       J     2         12 

Since  this  is  true  for  all  e1  such  that  t-,61  and  t20  '  are  ground  terms, 
t,  <  t2  in  the  path  ordering  on  terms.  This  completes  the  proof. 

If  t-i  and  t2  are  not  necessarily  ground  terms,  then  we  can 
still  compute  the  path  ordering  on  t-,  and  t?  in  linear  time  using  the 
above  theorem,  assuming  that  the  number  of  function  and  constant 
symbols  occurring  in  t-,  and  t?  is  bounded.  We  perform  a  simple  modification 
to  the  algorithm  for  ground  terms,  as  follows:  Let  x.,  S.,  and  e  be 
as  in  the  above  theorem.  We  modify  the  algorithm  so  that  it  prints 
out  two  lists: 

1.  The  paths  of  Paths(t-,e)  *  Paths(t2e)  in  non-decreasing 
order,  each  path  identified  as  coming  from  t,  or  t?. 
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The  paths  inj^sj)  *[jVJ]  in 


2.  The  paths  i n [.,*■, S .;  |  *|  H^-, SH  |  in  non-decreasing  order, 
where  each  path  is  identified  as  belonging  to  a  particular  set  S.. 
That  is,  the  values  of  i  and  j  are  also  given. 

Both  of  these  lists  can  be  computed  during  the  same  scan  of  t, 
and  t?.  Also,  by  reading  these  lists  in  reverse  order,  it  can  easily 
be  determined  in  linear  time  whether  the  conditions  of  the  theorem 
are  true.  Hence  we  can  compute  the  ordering  in  linear  time,  assuming 
that  the  number  of  function  symbols  is  bounded. 

A  slight  subtlety  in  this  and  the  preceding  algorithm  is  that 
when  scanning  the  output  in  reverse,  we  don't  really  need  to  know  the 
whole  path  for  each  path  on  the  list.  All  we  need  to  know  is  whether 
the  path  is  identical  to  the  preceding  element  of  the  list.  Hence  the 
lists  can  be  printed  and  scanned  in  linear  time,  since  we  only  need 
a  constant  amount  of  information  per  path  (assuming  that  the  relevant 
integer  indices  don't  get  too  big). 

The  replacements  that  the  path  ordering  can  handle  are   very 
similar  to  those  which  Dershowitz's  nested  multiset  ordering  can  deal 
with  [4].    However,  the  path  ordering  can  deal  with  more  than  one 
function  symbol  being  "pushed  in"  at  the  same  time. 

7.  Examples 

The  following  replacements  are  all  simplifications  in  the  path 
ordering  on  terms.  Assume  function  symbols  are  ordered  alphabetically. 
Thus  a  <  b,  b  <  c  et  cetera. 
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g(f(x,y))  -  f(g(x),g(y)) 
e(d(x,y))  ■*  d(e(x),e(y)) 
g(f(x,y))  •>  ff(g(e(x)),g(e(y))) 
g(f(x,y))  -  f(f(g(x),g(y)),f(g(x),g(y))) 
g(e(x))   +  f(x,g(x)) 
e(c(x))   -  d(x,e(x)) 
Note  that  the  same  ordering  can  handle  aVj_  of  these  replacements 
at  the  same  time. 

8.  General  Characterization 

We  now  present  a  method  of  obtaining  a  fairly  general  class 
of  simplifications  in  the  path  ordering  on  terms. 

Definition:  If  s  and  s  are  ground  terms,  then  the  replacement 
?,  -»  s?  is  a  strong  simplification  in  the  path  ordering  if  the  maximal 
element  of  Paths  (s?)  is  less  than  the  maximal  element  of  Paths  (s-i) 
in  the  ordering  on  paths. 

Definition:  If  s,  and  s?  are  not  necessarily  ground  terms,  then 
the  replacement  s -,  -*  So  is  a  strong  simplification  in  the  path  ordering  if 
for  all  substitutions  o  such  that  s  e  and  so  are  ground  terns,  s.o  -»  s?o 
is  a  strong  simplification  in  the  path  ordering. 

Examples:  The  following  replacements  are  all  strong  simplifications 

in  the  path  ordering  on  terms,  assuming  a  <  b,  b  <  c,  et  cetera: 

f(x,y)  -»  x 
e(d(x,y))  -  e(x) 
e(d(x,y))  ■*  e(c(x)) 

In  general,  if  s   is  a  subterm  of  s  then  s  -»■  s  is  a  strong  simplification 

in  the  path  ordering.  Also,  the  preceding  six  examples  of  simplifications 

are  all  strong  simplifications  in  the  path  ordering. 
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Theorem  8.1 .  Let  notation  be  as  in  the  theorem  characterizing 

the  path  ordering  on  non-ground  terms.  Then  t2  -*•  t^  is  a  strong 
simplification  in  the  path  ordering  iff  both  of  the  following  conditions 

are  true: 

a)  For  all  j,  l<jsk,  the  maximal  element  of  S-  is  less  than 

o 

the  maximal  element  of  S.  in  the  ordering  on  paths. 

•J 

b)  The  maximal  element  of  Paths  (tjO)  is  less  than  the  maximal 
element  of  Paths  (t-O)  in  the  ordering  on  paths. 

We  have  the  following  characterization  of  a  frequently  occurring 
class  of  replacements,  all  of  which  are  simplifications  in  the  path  ordering 
on  terms: 

Theorem  8.2.  Suppose  t-istoL.-jt  and  t  are  terms,  possibly  with 
variables  in  them.  Suppose  h  is  the  top-level  function  symbol  of  t.  Also, 
suppose  that  t  -*  t .  is  a  strong  simplification  in  the  path  ordering,  for 
i  =  l,2,...,n.  Suppose  {f-,,...,f.}  is  a  set  of  function  and  constant 
symbols  all  of  which  are  simpler  than  h  in  the  symbol  ordering.  Let  u  be 
any  term  formed  from  any  number  of  occurrences  of  t-.  »tps  •  •  •  »t  and  any 
number  of  occurrences  of  the  symbols  {f, ,...,f.}.  Then  the  replacement 
t  ■*  u  is  a  simplification  (in  fact  a  strong  simplification)  in  the  path 
ordering  on  terms. 

Thus  we  can  deal  with  replacements  such  as  the  following: 

fact(s(x))  -*•  s(x)*fact(x) 
count(cons(x,yj)  ->  count(x)  +  count(y) 
h(g(x),g(y))  -»  h(x,y) 

h(g(x),c(y))  -  f(h(x,y),h(x,y)) 

h(g(x),g(y))  ->  f(h(e(x),e(y)),n(e(x),e(y))) 
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We  still  cannot  deal  with  replacements  in  which  only  one  argument  of 
h  gets  simpler,  however.  Here  is  an  example  of  such  a  replacement: 

h(g(x),g(y))  -  e(h(g(x),y)) 
If  we  could  find  a  general  ordering  in  which  such  replacements  were 
simplifications,  then  we  could  probably  handle  the  distributive  replace- 
ments of  multiplication  over  addition. 

The  following  result  helps  us  to  prove  the  above  theorem: 

Theorem  8.3.  Suppose  a  and  3  are  paths  such  that  Subseq  (a)  < 
Subseq  (3).  Let  f  be  a  function  symbol  which  is  less  than  the  first 
function  symbol  of  3  in  the  symbol  ordering.  Then  Subseq  (fa)  <  Subseq  (3) 

Corol lary:  Suppose  a  and  3  are  paths  such  that  Subseq  (a)  < 
Subseq  (3).  Suppose  3  is  a  path  composed  entirely  of  function  symbols 
which  are   less  than  the  first  symbol  of  3  in  the  symbol  ordering.  Then 
Subseq  (got)  <  Subseq  (3) . 

We  still  cannot  deal  with  the  following  two  useful  replacements 

using  this  ordering: 

(x+y)  +  z  -  x  +  (y+z) 
x  *  (y+z)  ^x*y+x*z 

We  are  developing  other  methods  which  can  handle  these  replacements  in 
addition  to  replacements  such  as  those  discussed  above.  In  fact,  we 
have  recently  developed  methods  which  can  handle  the  second  of  the 
above  replacements,  in  addition  to  replacements  similar  to  those  in- 
cluded in  the  above  characterization. 
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9.  Summary  and  Conclusions 

We  have  described  a  simple  class  of  well-founded  partial 
orderings  which  can  be  used  to  obtain  short  proofs  of  termination  of 
systems  of  rewrite  rules,  in  many  cases  which  appear  to  be  of  practical 
interest.  These  "path  of  subterm"  orderings  are  based  on  multisets 
of  sequences  of  function  symbols  from  the  terms  we  are  working  with. 
We  have  given  explicit  programs  which  can  decide  whether  s  <  t  in 
such  an  ordering,  given  terms  s  and  t.  These  programs  run  in  linear 
time,  assuming  that  the  set  of  function  symbols  is  fixed.  Also,  we  have 
described  a  general  class  of  replacements,  all  of  which  are  simplifi- 
cations in  the  ordering  presented.  We  have  presented  some  limitations 
of  this  method  of  partial  ordering.  In  future  work  we  plan  to  discuss 
recent  results  extending  these  ideas  to  overcome  some  of  the  limitations 
of  the  path  of  subterms  ordering,  at  the  expense  of  some  simplicity. 
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